Kernel bandwidth estimation for non-parametric density estimation: a comparative study
Abstract
We investigate the performance of conventional bandwidth estimators for non-parametric kernel density estimation on a number of representative pattern-recognition tasks, to gain a better understanding of the behaviour of these estimators in high-dimensional spaces. We show that there are several regularities in the relative performance of conventional kernel bandwidth estimators across different tasks and dimensionalities. In particular, we find that the Silverman rule-of-thumb and maximal-smoothing principle estimators consistently perform competitively on most tasks and dimensions for the datasets considered.

Keywords: non-parametric density estimation; kernel density estimation; kernel bandwidth estimation; pattern recognition

I. KERNEL DENSITY ESTIMATION

Kernel Density Estimators (KDEs) estimate the non-parametric density function of a set of D-dimensional i.i.d. data samples, X, as the sum of parametric functions, where the parametric function is known as a kernel and is centred on each sample. More formally, the density of a data point x can be estimated as
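The excerpt ends before the estimator's equation, which is not reproduced here. As a rough illustration only, the sketch below assumes a Gaussian product kernel with one bandwidth per dimension and the widely cited multivariate form of the Silverman rule of thumb, h_d = sigma_d * (4 / ((D + 2) N))^(1 / (D + 4)). The function names `silverman_bandwidth` and `kde_density` are hypothetical, and the paper's exact formulation may differ.

```python
import numpy as np


def silverman_bandwidth(X):
    """Rule-of-thumb bandwidth per dimension for an (N, D) sample array.

    Assumes the common multivariate Silverman form; not taken from the paper.
    """
    n, d = X.shape
    sigma = X.std(axis=0, ddof=1)  # sample standard deviation of each dimension
    return sigma * (4.0 / ((d + 2) * n)) ** (1.0 / (d + 4))


def kde_density(x, X, h):
    """Estimate the density at query points x (M, D) from samples X (N, D).

    Uses a Gaussian product kernel centred on each sample, with diagonal
    bandwidths h of shape (D,). This kernel choice is an assumption.
    """
    x = np.atleast_2d(x)
    d = X.shape[1]
    # Scaled differences between every query point and every sample: (M, N, D)
    u = (x[:, None, :] - X[None, :, :]) / h
    # Log of the Gaussian normalising constant for diagonal bandwidths
    log_norm = -0.5 * d * np.log(2.0 * np.pi) - np.log(h).sum()
    log_kernel = log_norm - 0.5 * (u ** 2).sum(axis=2)
    # Average the kernel contributions of all N samples
    return np.exp(log_kernel).mean(axis=1)


# Usage on synthetic data (illustrative only)
rng = np.random.default_rng(0)
samples = rng.standard_normal((500, 3))
bandwidths = silverman_bandwidth(samples)
density_at_origin = kde_density(np.zeros((1, 3)), samples, bandwidths)
```

The all-pairs difference array needs M x N x D floats, so this form suits small illustrative datasets rather than large-scale experiments.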
Related articles
Asymptotic Behaviors of Nearest Neighbor Kernel Density Estimator in Left-truncated Data
Kernel density estimators are the basic tools for density estimation in non-parametric statistics. The k-nearest neighbor kernel estimators represent a special form of kernel density estimators, in which the bandwidth is varied depending on the location of the sample points. In this paper, we initially introduce the k-nearest neighbor kernel density estimator in the random left-truncatio...
Estimating the Size Probability of Landslides in the Pivejan Watershed (Razavi Khorasan Province)
Knowing the number, area, and frequency of landslides that have occurred in an area plays a prominent role in understanding the long-term evolution of landslide-dominated areas and can be used for analyzing susceptibility, hazard, and risk. In this regard, the current research examines the size probability of identified landslides in the Pivejan Watershed, Razavi Khorasan Province. In the first step, lands...
Breast Cancer Diagnosis Using Non-parametric Kernel-Based Probability Density Estimation
Introduction: Breast cancer is the most common cancer in women. An accurate and reliable system for early diagnosis of benign or malignant tumors seems necessary. Using the results of FNA together with data mining and machine learning techniques, new methods can be designed for early diagnosis that detect breast cancer with high accuracy. Materials and Methods: In this study,...
Bayesian Approaches to Non-parametric Estimation of Densities on the Unit Interval
This paper investigates nonparametric estimation of density on [0,1]. The kernel estimator of density on [0,1] has been found to be sensitive to both bandwidth and kernel. This paper proposes a unified Bayesian framework for choosing both the bandwidth and kernel function. In a simulation study, the Bayesian bandwidth estimator performed better than others, and kernel estimators were sensitive ...
Kernel Density Estimation for An Anomaly Based Intrusion Detection System
This paper presents a new nonparametric method to simulate probability density functions of some random variables that arise in characterizing an anomaly based intrusion detection system (ABIDS). A group of kernel density estimators is constructed and the criteria for bandwidth selection are discussed. In addition, statistical parameters of these distributions are computed, which can be used dire...